

**IN THE CLAIMS**

Please amend the claims as follows:

---

1. (Currently Amended) A method for performing a gather operation on a computer processor comprising:

computing addresses for a plurality of data elements of a matrix stored in memory, ~~utilizing~~ wherein each data element is identified by one of a plurality of indices and a base address, and wherein computing addresses comprises executing a first plurality of instructions to transfer a plurality of said indices from a first storage location where the indices are stored substantially contiguously, to an equal plurality of separate storage locations, wherein each index is assigned its own separate storage location;

retrieving each of said data elements from memory based on the computed addresses; and

executing a second plurality of instructions, each instruction depositing one or more of said data elements contiguously with other data elements in a second storage location.

2. (Original) The method as in claim 1 wherein said storage locations are registers.

3. (Currently Amended) The method as in claim 1 wherein computing addresses further comprises:

~~extracting indices for each of said data elements into separate storage locations; and~~

adding each of said indices to a base address.

4. (Currently Amended) The method as in claim 1 further comprising:

loading each of said data elements from memory into separate storage locations prior to executing said second plurality of instructions.

5. (Currently Amended) The method as in claim 1 wherein said computer processor executes two or more of said first and/or second plurality of instructions in a single clock cycle.

6. (Original) The method as in claim 1 further comprising:

storing each of said data elements on a mass storage device.

7. (Original) The method as in claim 2 wherein said registers are 64-bits wide and said data elements are 16-bits in length.
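
By way of non-limiting illustration only, the gather steps recited in claims 1 through 7 may be sketched in Python as follows. The sketch is an assumption-laden model and is not the claimed instruction set: `extract_field`, `deposit_field`, and `gather` are hypothetical names, registers are modeled as Python integers holding four 16-bit fields in a 64-bit value (per claim 7), memory is modeled as a list in which one slot holds one data element, and an address is taken to be the base address plus the index (per claim 3).

```python
MASK16 = 0xFFFF  # 16-bit data elements, as in claim 7
LANES = 4        # a 64-bit register holds four 16-bit fields


def extract_field(reg, lane):
    """Hypothetical extract step: return the 16-bit field at `lane` of a 64-bit value."""
    return (reg >> (16 * lane)) & MASK16


def deposit_field(reg, lane, value):
    """Hypothetical deposit step: return `reg` with its 16-bit field at `lane` set to `value`."""
    shift = 16 * lane
    cleared = reg & ~(MASK16 << shift)
    return cleared | ((value & MASK16) << shift)


def gather(memory, base, packed_indices):
    """Model of the gather: contiguous indices in, contiguous data elements out."""
    # First plurality of instructions: move each index from the packed
    # (contiguous) location into its own separate storage location.
    indices = [extract_field(packed_indices, lane) for lane in range(LANES)]

    # Compute one address per data element by adding each index to the base.
    addresses = [base + index for index in indices]

    # Retrieve each data element from memory into its own storage location.
    elements = [memory[address] for address in addresses]

    # Second plurality of instructions: deposit each element contiguously
    # with the others in a second storage location.
    packed = 0
    for lane, element in enumerate(elements):
        packed = deposit_field(packed, lane, element)
    return packed
```

In this model, calling `gather(memory, base, packed_indices)` with four indices packed into one 64-bit value returns one 64-bit value whose four 16-bit fields hold the corresponding data elements in index order.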

8. (Currently Amended) A method for performing a scatter operation on a computer processor comprising:

calculating addresses in memory to which a plurality of data elements are to be scattered to form a matrix in memory ~~utilizing~~ wherein each address in memory is identified by one of a plurality of indices and a base address;

executing a plurality of extract instructions, each of said extract instructions extracting one or more of said data elements from a storage location in which said data elements are stored contiguously to an equal plurality of separate storage locations; and

storing transferring said data elements from said separate storage locations to said calculated addresses in memory.

9. (Currently Amended) The method as in claim 8 wherein each of said storage locations is a register.

10. (Previously Presented) The method as in claim 8 wherein calculating addresses comprises:

extracting indices for each of said data elements into separate storage locations; and

adding each of said indices to a base address.

11. (Previously Presented) The method as in claim 8 wherein storing each of said data elements is accomplished via a plurality of STORE instructions executed by said computer processor.

12. (Previously Presented) The method as in claim 8 wherein said computer processor executes two or more of said instructions in a single clock cycle.

13. (Original) The method as in claim 9 wherein said register is 64-bits wide and said data elements are 16-bits in length.
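
By way of non-limiting illustration only, the scatter steps recited in claims 8 through 13 may be sketched in Python as follows. The sketch rests on stated assumptions and is not the claimed instruction set: `extract_field` and `scatter` are hypothetical names, a 64-bit register is modeled as a Python integer holding four 16-bit fields (per claim 13), memory is modeled as a list in which one slot holds one data element, and each address is taken to be the base address plus the index (per claim 10).

```python
MASK16 = 0xFFFF  # 16-bit data elements, as in claim 13
LANES = 4        # a 64-bit register holds four 16-bit fields


def extract_field(reg, lane):
    """Hypothetical extract step: return the 16-bit field at `lane` of a 64-bit value."""
    return (reg >> (16 * lane)) & MASK16


def scatter(memory, base, packed_indices, packed_data):
    """Model of the scatter: contiguous data elements out to computed addresses."""
    # Calculate the destination addresses: extract each index into its own
    # storage location, then add it to the base address.
    indices = [extract_field(packed_indices, lane) for lane in range(LANES)]
    addresses = [base + index for index in indices]

    # Extract each data element from the contiguous storage location into
    # its own separate storage location.
    elements = [extract_field(packed_data, lane) for lane in range(LANES)]

    # One store per element (compare the STORE instructions of claim 11).
    for address, element in zip(addresses, elements):
        memory[address] = element
```

In this model, `scatter(memory, base, packed_indices, packed_data)` modifies `memory` in place; slots not named by any index are left unchanged.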

14. (Currently Amended) A computer system comprising:

a memory;

a processor communicatively coupled to the memory; and

a storage device communicatively coupled to the processor and having stored therein a sequence of instructions which, when executed by the processor, causes the processor to at least,

compute addresses for a plurality of data elements of a matrix stored in memory, ~~utilizing~~ wherein each data element is identified by one of a plurality of indices and a base address, and wherein computing addresses comprises executing a first plurality of instructions to transfer a plurality of said indices from a first storage location where the indices are stored substantially contiguously, to an equal plurality of separate storage locations, wherein each index is assigned its own separate storage location;

retrieve each of said data elements from memory based on the computed addresses; and

execute a second plurality of instructions, each instruction to deposit one or more of said data elements contiguously with other data elements in a second storage location.

15. (Original) The computer system as in claim 14 wherein said storage locations are registers.

16. (Currently Amended) The computer system as in claim 14 wherein, responsive to one or more instructions in said sequence, said processor computes addresses by:

~~extracting indices for each of said data elements into separate storage locations; and~~

adding each of said indices to a base address.

17. (Currently Amended) The computer system as in claim 14 wherein said processor loads each of said data elements from memory into separate storage locations prior to executing said second plurality of ~~DEPOSIT~~ instructions.

18. (Currently Amended) The computer system as in claim 17 wherein said processor executes two or more of said first and/or second plurality of instructions in a single clock cycle.

19. (Original) The computer system as in claim 14 wherein, responsive to one or more instructions in said sequence, said processor further:

stores each of said data elements on said mass storage device.

20. (Original) The computer system as in claim 15 wherein said registers are 64-bits wide and said data elements are 16-bits in length.

21. (Previously Presented) A method as in claim 1 wherein computing addresses comprises:

executing a series of instructions, each instruction to extract an address index for one of said plurality of data elements.

22. (Original) The method as in claim 21 wherein said address indices are extracted from a series of contiguous memory locations.

23. (Cancelled)